Abstract

We show that with high probability, random rank 1 matrices over
a finite field are in (linearly) general position, at least provided their
shape $k \times l$ is not excessively unbalanced. This translates into saying that
the dimension of the $*$-product of two $[n, k]$ and $[n, l]$ random codes is
equal to $\min(n, kl)$, as one would have expected. Our work is inspired by
a similar result of Cascudo-Cramer-Mirandola-Zémor [1] dealing with $*$-squares
of codes, which it complements, especially regarding applications
to the analysis of McEliece-type cryptosystems. We also briefly
mention the case of higher $*$-powers, which requires taking the Frobenius
into account. We then conclude with some open problems.


1 Introduction 

Many fundamental problems in information theory and in theoretical computer 
science can be expressed in terms of the structure of linearly independent and 
generating subsets of a set in a vector space, as illustrated by [10] and the 
subsequent success of matroid theory. In this context the importance of the 
following definition is self-evident: 

Definition 0. Let $V$ be a finite-dimensional vector space, over an arbitrary
field. We say a set $X \subseteq V$ is in general position if any finite subset $S \subseteq X$ has
its linear span $\langle S \rangle$ of dimension

$$\dim\langle S\rangle = \min(|S|, \dim V).$$

This means that there are no more linear relations than expected between
elements of $X$: any $S \subseteq X$ of size $|S| \le \dim V$ is linearly independent, and any
$S \subseteq X$ of size $|S| \ge \dim V$ is a generating set in $V$.
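
As a concrete check of Definition 0, the brute-force sketch below tests every nonempty subset of a small set of vectors over a prime field. The helper names `rank_mod_p` and `in_general_position`, and the restriction to a prime field $\mathbb{F}_p$, are assumptions made for illustration only; the enumeration is exponential and only meant for toy sizes.

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank over the prime field F_p of the matrix whose rows are `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        # Look for a pivot in this column, below the rows already used.
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def in_general_position(vectors, p):
    """Check dim<S> == min(|S|, dim V) for every nonempty subset S."""
    dim_v = len(vectors[0])
    for size in range(1, len(vectors) + 1):
        for subset in combinations(vectors, size):
            if rank_mod_p(list(subset), p) != min(size, dim_v):
                return False
    return True


# Four pairwise non-proportional vectors of F_3^2 are in general position.
print(in_general_position([(1, 0), (0, 1), (1, 1), (1, 2)], 3))
# Adding a multiple of (1, 0) to a basis breaks it.
print(in_general_position([(1, 0), (0, 1), (2, 0)], 3))
```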

This requirement is quite strong, and weaker variants have been considered. 
We can cite at least three of them. 

The first one is to introduce thresholds. We say $X$ is in $(a, b)$-general position
if any $S \subseteq X$ of size $|S| \le a$ is linearly independent, and any $S \subseteq X$ of size
$|S| \ge b$ is a generating set in $V$. This notion should look very familiar to coding
experts. Indeed one shows easily:

Lemma 1. Let $C$ be a $q$-ary $[n, k]$ code, with generating matrix $G$. Set $V = \mathbb{F}_q^k$
and let $X \subseteq V$ be the set of columns of $G$. Then $X$ is in $(a, b)$-general position,
with $a = d_{\min}(C^{\perp}) - 1$ and $b = n - d_{\min}(C) + 1$.

A second one is to allow a small gap $g$ from the expected dimension: we say
$X$ is in $g$-almost general position if for any $S \subseteq X$ we have

$$\dim\langle S\rangle \ge \min(|S|, \dim V) - g.$$

This means allowing up to g more linear relations than expected. There is an 
obvious link with the previous notion: 

Lemma 2. If $X \subseteq V$ is in $(a, b)$-general position, then it is in $g$-almost general
position for $g = \min(\dim V - a, b - \dim V)$.

We leave it to the reader to combine Lemma 1 and Lemma 2 and give a
coding-theoretic interpretation of this integer $g$ (or a geometric interpretation
in case $C$ is an AG-code).

Last, our third variant is probabilistic, allowing a small proportion of $S$ to
fail in Definition 0. In fact, rather than subsets of $X$, it will be easier to consider
sequences of elements of $X$, possibly with repetitions. For this we will assume
that $X$ is equipped with a probability distribution $\mathcal{L}$. A natural choice when $X$
is finite would be to take the uniform distribution; however, more general $\mathcal{L}$ will
be allowed. Then, measuring how close $X \subseteq V$ is to being in general position
reduces to the following:

Problem 3. Let $n \ge 1$, and $u_1, \dots, u_n$ random elements of $X$ (understood:
independent, and distributed according to $\mathcal{L}$). Give bounds on the “error probability”

$$P[\dim\langle u_1, \dots, u_n\rangle < \min(n, \dim V)].$$

In this work we address this problem for $V = \mathbb{F}_q^{k \times l}$ a matrix space, and
$X \subseteq V$ the set of matrices of rank 1.

Understanding the linear span of families of rank 1 matrices is especially 
important regarding the theory of bilinear complexity (or equivalently, that of 
tensor decomposition). Indeed, computing the complexity of a bilinear map 
(or the rank of a 3-tensor) reduces to the following [2] [2] [7] [8]: given a linear
subspace $W \subseteq \mathbb{F}_q^{k \times l}$, find a family of rank 1 matrices of minimal cardinality
whose linear span contains W. 

Another motivation comes from the theory of $*$-products of codes, and
in particular its use in a certain class of attacks against McEliece-type
cryptosystems. Given words $c = (c_1, \dots, c_n)$, $c' = (c'_1, \dots, c'_n) \in \mathbb{F}_q^n$, we let
$c * c' = (c_1 c'_1, \dots, c_n c'_n) \in \mathbb{F}_q^n$ be their componentwise product. Then [5] if
$C, C' \subseteq \mathbb{F}_q^n$ are two linear codes of the same length, their product $C * C' \subseteq \mathbb{F}_q^n$
is defined as the linear span of the $c * c'$ for $c \in C$, $c' \in C'$. We can also define
the square $C^{\langle 2 \rangle} = C * C$, and likewise for higher powers $C^{\langle s \rangle}$.
Setting $k = \dim C$ and $l = \dim C'$, it is then easily seen that

$$\dim C * C' \le kl,$$
$$\dim C^{\langle 2 \rangle} \le k(k+1)/2,$$

and in fact for small $k$, $l$ and random $C$, $C'$ one expects these inequalities to be
equalities. For the second inequality, this is proved in [3]. For the first inequality,
we will see this reduces to our solution of Problem 3 for rank 1 matrices.
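
The two inequalities can be observed on a toy example. The sketch below computes the dimension of a componentwise product from generating matrices over a prime field; the function names and the choice of evaluation codes over $\mathbb{F}_5$ are illustrative assumptions, not taken from the text.

```python
def rank_mod_p(rows, p):
    """Rank over the prime field F_p of the matrix whose rows are `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def star_product_dimension(G, Gp, p):
    """Dimension of C * C', where G and Gp are generating matrices over F_p."""
    # C * C' is spanned by the componentwise products of rows of G and Gp.
    products = [[(a * b) % p for a, b in zip(x, y)] for x in G for y in Gp]
    return rank_mod_p(products, p)


# Evaluations of 1, t, t^2 at the five points of F_5.
ones = [1, 1, 1, 1, 1]
t = [0, 1, 2, 3, 4]
t2 = [(a * a) % 5 for a in t]

# <1, t> * <1, t^2> is spanned by 1, t, t^2, t^3: dimension kl = 4.
print(star_product_dimension([ones, t], [ones, t2], 5))
# <1, t> * <1, t> is spanned by 1, t, t^2: dimension k(k+1)/2 = 3.
print(star_product_dimension([ones, t], [ones, t], 5))
```

The second call illustrates how a structured code (here a Reed-Solomon-like code multiplied by itself) falls short of the bound $kl$.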

So, together, [4] and our results support the heuristic at the heart of the 
aforementioned attacks against McEliece-type cryptosystems. Indeed, the very
principle of these attacks is to uncover the hidden algebraic structure of an 
apparently random code (which serves as the public key) by identifying subcodes 
for which equality fails in these inequalities (for instance, the dimension of the 
product behaving additively rather than multiplicatively). 


2 Generic approach 

Here $V$ is an abstract vector space of dimension $m$ over $\mathbb{F}_q$, and $X \subseteq V$ an
arbitrary subset. We may assume $X$ spans $V$. We are interested in the function

$$P(n) = P[\dim\langle u_1, \dots, u_n\rangle < \min(n, \dim V)].$$

Clearly it is unimodal; more precisely, it is increasing for $n \le m$ and decreasing
for $n \ge m$. Now we study each of these two cases in more detail.

2.1 Case $n \ge m$.

We have $\dim\langle u_1, \dots, u_n\rangle < m$ iff $u_1, \dots, u_n$ are contained in a hyperplane $H$
of $V$. Using the union bound and the independence of the $u_i$ we get at once:

Proposition 4. We have

$$P(n) \le \sum_H P[u_1, \dots, u_n \in H] = \sum_H P[u_1 \in H]^n$$

where $H$ ranges over hyperplanes of $V$.

This bound is exponentially small. More precisely, set

$$\rho = \max_H P[u_1 \in H]$$

(for instance $\rho = \max_H |X \cap H|/|X|$ if $\mathcal{L}$ is the uniform distribution). We then see
immediately:

Corollary 5. For all $n \ge m$ we have

$$c\rho^{n-m} \le P(n) \le c'\rho^{n-m},$$

where $c = \rho^m$ and $c' = \sum_H P[u_1 \in H]^m$.

It should be noted that $c$, $c'$, $\rho$ depend on $V$ and $X$. So, part of the job will
be to make these constants more explicit once $V$ and $X$ are specified.
Another interesting fact is that the RHS in Proposition 4 is

$$\sum_H P[u_1, \dots, u_n \in H] = E[|\{H;\ u_1, \dots, u_n \in H\}|],$$

the expected value of the number of hyperplanes containing $u_1, \dots, u_n$. However,
this number is precisely $\frac{q^d - 1}{q - 1}$, where $d = \operatorname{codim}\langle u_1, \dots, u_n\rangle$. This allows
us to combine our second and third variants of the notion of general position:

Proposition 6. For $0 \le g \le \min(m, n)$ we have

$$P[\dim\langle u_1, \dots, u_n\rangle < m - g] \le \frac{q-1}{q^{g+1}-1}\, c'\rho^{n-m}$$

(with $c'$, $\rho$ as above), and also

$$P[\dim\langle u_1, \dots, u_n\rangle < m - g] \le \sum_W P[u_1 \in W]^n$$

where $W \subseteq V$ ranges over subspaces of codimension $g + 1$.

Proof. The first inequality follows from the discussion above, using Markov’s
inequality as in […, Prop. 5.1]. The second is a direct approach using the union
bound similar to that of Proposition 4. □


Which of these two bounds is stronger, and which is more tractable, certainly 
depends on V and X. Note also that the bounds remain valid even without the 
assumption $m \le n$.

We illustrate what precedes for $X = V = \mathbb{F}_q^m$ with uniform distribution (this
will be used later). We introduce the converging infinite product

$$C_q = \prod_{j \ge 1} \frac{1}{1 - q^{-j}}.$$

Numerically, $C_q \le C_2 \approx 3.463$.
We let

$$\binom{m}{r}_q = \prod_{1 \le j \le r} \frac{q^{m-j+1} - 1}{q^j - 1}$$

denote the number of $r$-dimensional subspaces in $\mathbb{F}_q^m$.

Lemma 7. We have $\binom{m}{r}_q \le C_q\, q^{r(m-r)}$.

Proof. From $\frac{q^{m-j+1} - 1}{q^j - 1} \le q^{m-2j+1}(1 - q^{-j})^{-1}$. □

Proposition 8. For $0 \le r \le \min(m, n)$ and random $u_1, \dots, u_n \in \mathbb{F}_q^m$ uniformly
distributed, we have

$$P[\dim\langle u_1, \dots, u_n\rangle \le r] \le C_q\, q^{-(n-r)(m-r)}.$$

Proof. Follows from what precedes, using $P[u_1 \in W] = q^{-(m-r)}$ for $\dim W = r$. □
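
The constants of this illustration are easy to evaluate numerically. The sketch below assumes that the infinite product is $\prod_{j \ge 1} (1 - q^{-j})^{-1}$ (which matches the quoted value 3.463 for $q = 2$) and that the subspace count is the usual Gaussian binomial coefficient; the function names `c_q` and `gaussian_binomial` are hypothetical.

```python
def c_q(q, terms=200):
    """Truncation of the infinite product prod_{j>=1} 1/(1 - q^-j)."""
    value = 1.0
    for j in range(1, terms + 1):
        value /= 1.0 - float(q) ** (-j)
    return value


def gaussian_binomial(m, r, q):
    """Number of r-dimensional subspaces of F_q^m."""
    num, den = 1, 1
    for j in range(1, r + 1):
        num *= q ** (m - j + 1) - 1
        den *= q ** j - 1
    return num // den


print(round(c_q(2), 3))
print(gaussian_binomial(4, 2, 2))

# Compare the subspace count with c_q * q^(r(m-r)) on a small range.
for q in (2, 3, 5):
    assert all(
        gaussian_binomial(m, r, q) <= c_q(q) * q ** (r * (m - r))
        for m in range(1, 8)
        for r in range(m + 1)
    )
```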

2.2 Case $n \le m$.

From now on we will suppose $(X, \mathcal{L})$ is homothety invariant: given any $\lambda \in \mathbb{F}_q^{\times}$,
then for random $u \in X$, we also have $\lambda u \in X$, with the same distribution $\mathcal{L}$.
We say a vector $z = (\lambda_1, \dots, \lambda_n) \in \mathbb{F}_q^n$ is a linear relation for $u_1, \dots, u_n$ if

$$\lambda_1 u_1 + \cdots + \lambda_n u_n = 0.$$

Also introduce the random variable

$$s_n = u_1 + \cdots + u_n \in V.$$

Lemma 9. For any $z \in \mathbb{F}_q^n$ of Hamming weight $w$, we have

$$P[z \text{ is a linear relation for } u_1, \dots, u_n] = P[s_w = 0].$$

Proof. We may suppose $z$ has support $\{1, \dots, w\}$, and we conclude since $u_i$ and
$\lambda_i u_i$ have the same distribution for $\lambda_i \ne 0$. □

Proposition 10. We have

$$P(n) \le \sum_{1 \le w \le n} \binom{n}{w} (q-1)^{w-1} P[s_w = 0].$$

Proof. Union bound, as in Proposition 4 (note that we may count linear relations
only up to proportionality). □

Likewise, Markov’s inequality gives, for any $g \ge 0$,

$$P[\dim\langle u_1, \dots, u_n\rangle < n - g] \le \frac{1}{q^{g+1} - 1} \sum_{w \ge 1} \binom{n}{w} (q-1)^{w} P[s_w = 0].$$

In these sums we expect the contribution of linear relations of large weight
to stay under control thanks to:

Proposition 11. As $w \to \infty$ we have

$$P[s_w = 0] \to \frac{1}{|V|},$$

except for $q = 2$ and $X$ contained in the translate of a hyperplane, in which
case we have $P[s_w = 0] = 0$ for odd $w$, and $P[s_w = 0] \to \frac{2}{|V|}$ for even $w \to \infty$.

Proof. We treat first the case $q > 2$, so there is a $\lambda \ne 0, 1$ in $\mathbb{F}_q$. The $s_w$ form
a random walk on the finite commutative group $V$. Seen as a Markov chain, it
is irreducible, because $X$ spans $V$ (as a vector space, but also as a group, since
$X$ is homothety-invariant). Moreover it is aperiodic, because the zero vector
can be written as a sum of 2 elements of $X$ (e.g. $s + (-s)$), and also as a sum
of 3 elements of $X$ (e.g. $(1 - \lambda)s + (-s) + \lambda s$). So it converges to its unique
stationary distribution, which can only be uniform.

The case $q = 2$ is similar, with a tweak on aperiodicity. □
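
The convergence of the random walk can be watched on a tiny example. The sketch below computes $P[s_w = 0]$ exactly by repeated convolution, taking as an illustrative assumption $V = \mathbb{F}_2^{2 \times 2}$ and steps of the form $pq^T$ with $p$, $q$ uniform and possibly zero; the function names are hypothetical, and only prime fields are handled.

```python
from fractions import Fraction
from itertools import product


def rank_le1_distribution(k, l, p):
    """Law of u = p q^T (flattened) for uniform, possibly zero, p and q over F_p."""
    law = {}
    weight = Fraction(1, p ** (k + l))
    for pv in product(range(p), repeat=k):
        for qv in product(range(p), repeat=l):
            u = tuple((a * b) % p for a in pv for b in qv)
            law[u] = law.get(u, 0) + weight
    return law


def prob_sum_is_zero(k, l, p, w):
    """Exact P[s_w = 0] for s_w = u_1 + ... + u_w, by repeated convolution."""
    step = rank_le1_distribution(k, l, p)
    zero = (0,) * (k * l)
    current = {zero: Fraction(1)}
    for _ in range(w):
        nxt = {}
        for s, ps in current.items():
            for u, pu in step.items():
                t = tuple((a + b) % p for a, b in zip(s, u))
                nxt[t] = nxt.get(t, 0) + ps * pu
        current = nxt
    return current.get(zero, Fraction(0))


# The values approach 1/|V| = 1/16 as w grows.
for w in (1, 2, 10, 40):
    print(w, float(prob_sum_is_zero(2, 2, 2, w)))
```

Here the set of steps contains the zero matrix and spans $V$, so it is not contained in the translate of a hyperplane and the exceptional case for $q = 2$ does not occur.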

3 Rank 1 matrices 

A matrix $u \in \mathbb{F}_q^{k \times l}$ is of rank 1 iff it can be written $u = pq^T$ for column vectors
$p \in \mathbb{F}_q^k \setminus \{0\}$, $q \in \mathbb{F}_q^l \setminus \{0\}$. Moreover these $p$, $q$ are uniquely determined up to a
scalar. This means, choosing random $p \in \mathbb{F}_q^k \setminus \{0\}$, $q \in \mathbb{F}_q^l \setminus \{0\}$ uniformly, and
setting $u = pq^T$, gives a random matrix of rank 1 with uniform distribution.
Actually we will use a slightly different model. Let

$$X_{k \times l} = \{u \in \mathbb{F}_q^{k \times l};\ \operatorname{rk} u \le 1\}$$

be the set of rank 1 matrices together with the zero matrix. Pick random $p \in \mathbb{F}_q^k$,
$q \in \mathbb{F}_q^l$ uniformly (possibly zero), and set $u = pq^T$. This gives our distribution
$\mathcal{L}$ on $X_{k \times l}$.

Note that if $u \in X_{k \times l}$ is distributed according to $\mathcal{L}$, then conditioning
on the event $u \ne 0$ gives back the uniform distribution on matrices of rank 1.
Conversely, if $b$ is a Bernoulli variable of parameter $P[b = 1] = (1 - q^{-k})(1 - q^{-l})$,
and if $u$ is a random uniformly distributed matrix of rank 1, then $bu \in X_{k \times l}$ is
distributed according to $\mathcal{L}$. Moreover, replacing $u_1, \dots, u_n$ with $b_1 u_1, \dots, b_n u_n$
can only decrease the dimension of their linear span. As a consequence, any
upper bound on $P(n)$ for $(X_{k \times l}, \mathcal{L})$ will also be an upper bound for uniformly
distributed matrices of rank 1.
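
The model with possibly zero factors is straightforward to simulate. The following Monte Carlo sketch estimates the probability that $n$ random matrices $pq^T$ span a space of dimension less than $\min(n, kl)$; it is restricted to a prime field, and the function names, trial counts and seeds are arbitrary choices made for illustration.

```python
import random


def rank_mod_p(rows, p):
    """Rank over the prime field F_p of the matrix whose rows are `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def estimate_error_probability(k, l, n, p, trials=2000, seed=0):
    """Monte Carlo estimate of P[dim<u_1..u_n> < min(n, kl)] for u_i = p_i q_i^T."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        rows = []
        for _ in range(n):
            pv = [rng.randrange(p) for _ in range(k)]
            qv = [rng.randrange(p) for _ in range(l)]
            # Flatten the k x l matrix p q^T into a vector of length kl.
            rows.append([(a * b) % p for a in pv for b in qv])
        if rank_mod_p(rows, p) < min(n, k * l):
            failures += 1
    return failures / trials


# Estimated error probability for 2 x 2 matrices over F_2 and growing n.
for n in (2, 4, 6, 8, 12):
    print(n, estimate_error_probability(2, 2, n, 2))
```

For $n = 1$ the event is simply $u = 0$, whose exact probability $1 - (1 - q^{-k})(1 - q^{-l})$ gives a quick sanity check of the sampler.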

Lemma 12. (i) Every linear form on $\mathbb{F}_q^{k \times l}$ is of the form $\ell_B = \operatorname{Tr}(B^T \cdot)$ for
a uniquely determined $B \in \mathbb{F}_q^{k \times l}$.

(ii) The number of $B \in \mathbb{F}_q^{k \times l}$ of rank $r$ is

$$\binom{k}{r}_q \binom{l}{r}_q |GL_r(\mathbb{F}_q)| \le C_q\, q^{r(k+l-r)}.$$

(iii) Given $B \in \mathbb{F}_q^{k \times l}$ of rank $r$, then for random $u = pq^T$ in $X_{k \times l}$ we have

$$P[\ell_B(u) = 0] = \frac{1}{q}\left(1 + \frac{q-1}{q^r}\right).$$

Proof. Point (i) is clear. For point (ii) we view $B$ as a linear map $\mathbb{F}_q^l \to \mathbb{F}_q^k$, and
we note that it is entirely determined by its kernel $\ker B \subseteq \mathbb{F}_q^l$ of codimension
$r$, its image $\operatorname{im} B \subseteq \mathbb{F}_q^k$ of dimension $r$, and the isomorphism $\mathbb{F}_q^l/\ker B \simeq \operatorname{im} B$
it induces. This gives the formula of the LHS, and the upper bound works as
in the proof of Lemma 7. For (iii) we note $\ell_B(u) = 0$ means $p^T B q = 0$, which
happens precisely when $p^T B = 0$ (of probability $q^{-r}$) or when $q$ is orthogonal
to $p^T B \ne 0$ (of probability $\frac{1}{q}(1 - q^{-r})$). □
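
Point (iii) can be confirmed by exhaustive enumeration on small cases. The sketch below reads the formula as $\frac{1}{q}\left(1 + \frac{q-1}{q^r}\right)$, which is what the two cases of the proof add up to ($q^{-r} + \frac{1}{q}(1 - q^{-r})$); the function names and the test matrices over $\mathbb{F}_3$ are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def prob_form_vanishes(B, p):
    """Exact P[p^T B q = 0] for uniform p in F_p^k, q in F_p^l (B is k x l)."""
    k, l = len(B), len(B[0])
    hits = 0
    for pv in product(range(p), repeat=k):
        for qv in product(range(p), repeat=l):
            value = sum(pv[i] * B[i][j] * qv[j] for i in range(k) for j in range(l))
            if value % p == 0:
                hits += 1
    return Fraction(hits, p ** (k + l))


def predicted(r, q):
    """The value (1/q) * (1 + (q - 1) / q^r) for a form of rank r."""
    return Fraction(1, q) * (1 + Fraction(q - 1, q ** r))


B1 = [[1, 0, 0], [0, 0, 0]]  # rank 1
B2 = [[1, 0, 0], [0, 1, 0]]  # rank 2
print(prob_form_vanishes(B1, 3), predicted(1, 3))
print(prob_form_vanishes(B2, 3), predicted(2, 3))
```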

For some of our results we will restrict to matrices whose long side grows
at most exponentially in the short side. More precisely, for any $\varepsilon, \kappa > 0$, we
introduce the parameter space

$$\mathcal{P}(\varepsilon, \kappa) = \left\{(k, l);\ 2 \le k \le l \le \frac{\varepsilon\, q^{\kappa k}}{(q-1)k}\right\}.$$

Now we fix a $\kappa > 0$ small enough so that $q^{(1-\kappa)^2} \ge 1 + \frac{q-1}{q}$ (for instance
$\kappa = 0.23$ works for any $q$), as well as some $0 < \varepsilon < 1$.
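
The numerical claim about $\kappa$ is quick to check. The sketch below assumes the condition reads $q^{(1-\kappa)^2} \ge 1 + \frac{q-1}{q}$ and tests it on a hand-picked list of prime powers; the function name is hypothetical.

```python
def kappa_condition_holds(q, kappa):
    """Check q^((1 - kappa)^2) >= 1 + (q - 1)/q."""
    return q ** ((1 - kappa) ** 2) >= 1 + (q - 1) / q


prime_powers = [2, 3, 4, 5, 7, 8, 9, 11, 13, 16, 25, 27, 32, 49, 64, 81, 128, 256]

# kappa = 0.23 satisfies the condition for each listed q.
print(all(kappa_condition_holds(q, 0.23) for q in prime_powers))
# For q = 2 there is little room left: kappa = 0.24 already fails.
print(kappa_condition_holds(2, 0.24))
```

The binding case is $q = 2$, since the right-hand side stays below 2 while the left-hand side grows with $q$.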

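Theorem 13. Let $(k, l) \in \mathcal{P}(\varepsilon, \kappa)$ and $n \ge kl$. Then for random $u_1, \dots, u_n \in
X_{k \times l}$ we have

$$P[u_1, \dots, u_n \text{ don't span } \mathbb{F}_q^{k \times l}] \le c''\rho^{n-kl}$$

with $\rho = \frac{1}{q}\left(1 + \frac{q-1}{q}\right)$ and $c''$ an explicit constant depending only on $q$ and $\varepsilon$.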

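Proof. We apply Corollary 5, where from Lemma 12 we get $\rho = \frac{1}{q}\left(1 + \frac{q-1}{q}\right)$
and

$$c' \le \frac{C_q}{q-1} \sum_{1 \le r \le k} \frac{\left(1 + \frac{q-1}{q^r}\right)^{kl}}{q^{(k-r)(l-r)}}.$$

We set $r_0 = [\kappa k]$ and split this last sum in two.

First, for $r < r_0$ we have $(k-r)(l-r) \ge (1-\kappa)^2 kl + (r_0 - r)$ and $1 + \frac{q-1}{q^r} \le
1 + \frac{q-1}{q}$, so, by our condition on $\kappa$, the corresponding term is at most $\frac{1}{q^{r_0 - r}}$.

On the other hand, for $r \ge r_0$ we have

$$\left(1 + \frac{q-1}{q^r}\right)^{kl} \le \frac{1}{1 - \frac{kl(q-1)}{q^{\kappa k}}} \le \frac{1}{1 - \varepsilon}.$$

We deduce:

$$c' \le \frac{C_q}{q-1}\left(\sum_{1 \le r < r_0} \frac{1}{q^{r_0 - r}} + \frac{1}{1-\varepsilon} \sum_{r_0 \le r \le k} \frac{1}{q^{(k-r)(l-r)}}\right).$$

□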


Given $k \le l$ and random $u_i \in X_{k \times l}$, recall for all $w \ge 1$ we set $s_w =
u_1 + \cdots + u_w \in \mathbb{F}_q^{k \times l}$.

Theorem 14. (i) For $1 \le w \le k + l$ we have

$$P[s_w = 0] \le \frac{2qC_q/(q-1)}{q^{kw/2}}.$$

(ii) For $w \ge k + l$ we have


= 0 ]< 




M 


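Proof. Write $u_i = p_i q_i^T$ with $p_i \in \mathbb{F}_q^k$, $q_i \in \mathbb{F}_q^l$ uniform. Let $G$ be the $k \times w$
matrix whose columns are $p_1, \dots, p_w$, and let $x_1, \dots, x_k \in \mathbb{F}_q^w$ be its rows.
Likewise let $G'$ be the $l \times w$ matrix whose columns are $q_1, \dots, q_w$, and let
$y_1, \dots, y_l \in \mathbb{F}_q^w$ be its rows. Note these $x$'s and $y$'s are uniform and independent.
Also our key observation is that $s_w = 0$ iff $\langle x_1, \dots, x_k\rangle \perp \langle y_1, \dots, y_l\rangle$ in $\mathbb{F}_q^w$.
Now we condition on $\dim\langle y_1, \dots, y_l\rangle$.

By Proposition 8 we have $P[\dim\langle y_1, \dots, y_l\rangle = e] \le C_q\, q^{-(l-e)(w-e)}$. Also,
$P[\langle x_1, \dots, x_k\rangle \perp \langle y_1, \dots, y_l\rangle \mid \dim\langle y_1, \dots, y_l\rangle = e] = q^{-ke}$. This gives

$$P[s_w = 0] \le C_q \sum_{0 \le e \le \min(l, w)} q^{-f(e)}$$

where $f(e) = ke + (l - e)(w - e)$. This function $f$ attains its minimum at
$e_0 = (l + w - k)/2$, from which we deduce, for $0 \le e \le \min(l, w)$:

$$f(e) \ge \begin{cases} kw + (w - e) \ge kw/2 + (w - e) & \text{for } w \le l - k, \\ f(e_0) + \lfloor |e - e_0| \rfloor \ge kw/2 + \lfloor |e - e_0| \rfloor & \text{for } l - k \le w \le k + l, \\ kl + (l - e)(w - e - k) & \text{for } w \ge k + l. \end{cases}$$

The first two cases together give point (i), while the third gives point (ii). □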
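
The key observation of the proof is the identity $s_w = \sum_i p_i q_i^T = G\,G'^T$, whose $(a, b)$ entry is the inner product of row $x_a$ of $G$ with row $y_b$ of $G'$. A minimal sketch over a prime field follows; the function names, sizes and seed are hypothetical choices for illustration.

```python
import random


def sum_of_outer_products(ps, qs, p):
    """s_w = sum_i p_i q_i^T over F_p, as a k x l list of lists."""
    k, l = len(ps[0]), len(qs[0])
    s = [[0] * l for _ in range(k)]
    for pv, qv in zip(ps, qs):
        for a in range(k):
            for b in range(l):
                s[a][b] = (s[a][b] + pv[a] * qv[b]) % p
    return s


def row_inner_products(ps, qs, p):
    """Matrix of inner products <x_a, y_b>, where x_a, y_b are the rows of G, G'."""
    k, l, w = len(ps[0]), len(qs[0]), len(ps)
    xs = [[ps[i][a] for i in range(w)] for a in range(k)]  # rows of G
    ys = [[qs[i][b] for i in range(w)] for b in range(l)]  # rows of G'
    return [[sum(x * y for x, y in zip(xa, yb)) % p for yb in ys] for xa in xs]


rng = random.Random(1)
ps = [[rng.randrange(3) for _ in range(2)] for _ in range(5)]  # columns of G
qs = [[rng.randrange(3) for _ in range(3)] for _ in range(5)]  # columns of G'
print(sum_of_outer_products(ps, qs, 3) == row_inner_products(ps, qs, 3))
```

In particular $s_w$ vanishes exactly when every $x_a$ is orthogonal to every $y_b$, which is the orthogonality of the two row spaces.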

Theorem 15. Let $(k, l) \in \mathcal{P}(\varepsilon, \kappa)$ and $n \le kl$. Then for random $u_1, \dots, u_n \in
X_{k \times l}$ we have


P[ui,...,Un lin. dependent] < ^ 

Proof. Split the sum in Proposition 10 in two: for $w \le k + l$ use Theorem 14 (i)
and $\binom{n}{w} \le (kl)^w$; for $w > k + l$ use Theorem 14 (ii). □


4 Products of codes 

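By a generating matrix for a linear code $C$ we mean any matrix $G$ whose row
span is $C$. We allow $G$ to have more than $\dim C$ rows.

Consider random $G \in \mathbb{F}_q^{k \times n}$, $G' \in \mathbb{F}_q^{l \times n}$ (uniform distribution), generating
matrices for $C, C' \subseteq \mathbb{F}_q^n$, so $\dim C \le k$, $\dim C' \le l$. Denote by $p_1, \dots, p_n \in \mathbb{F}_q^k$
the columns and by $x_1, \dots, x_k \in \mathbb{F}_q^n$ the rows of $G$. Denote by $q_1, \dots, q_n \in \mathbb{F}_q^l$
the columns and by $y_1, \dots, y_l \in \mathbb{F}_q^n$ the rows of $G'$.

Identify the matrix space $\mathbb{F}_q^{k \times l}$ with $\mathbb{F}_q^{kl}$.

The product $C * C'$ and its generating matrix $\widehat{G} \in \mathbb{F}_q^{kl \times n}$ admit the
following equivalent descriptions [5]:

(i) $\widehat{G}$ has rows all products $x_i * y_j$

(ii) $C * C'$ is the projection of $C \otimes C'$ on the diagonal

(iii) $\widehat{G}$ has columns the $\operatorname{rk} \le 1$ matrices $p_1 q_1^T, \dots, p_n q_n^T$

(iv) $C * C'$ is the image of the evaluation map

$$\mathrm{ev}: \operatorname{Bilin}(\mathbb{F}_q^k \times \mathbb{F}_q^l) \to \mathbb{F}_q^n, \quad B \mapsto (B(p_1, q_1), \dots, B(p_n, q_n)).$$

From description (iii) we can translate our Theorems 13 and 15. Recall
$q^{(1-\kappa)^2} \ge 1 + \frac{q-1}{q}$ and $0 < \varepsilon < 1$.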

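Theorem 16. For $(k, l) \in \mathcal{P}(\varepsilon, \kappa)$ and $n \ge kl$, we have

$$P[\dim C * C' < kl] \le c''\rho^{n-kl}$$

with $\rho = \frac{1}{q}\left(1 + \frac{q-1}{q}\right)$ and $c''$ as in Theorem 13.

Theorem 17. For $(k, l) \in \mathcal{P}(\varepsilon, \kappa)$ and $n \le kl$, we have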

P[dimC * G' <n]< 

(g- 1)2 -e 

Note that if $k \to \infty$ and $kl/q^{\kappa k} \to 0$ (for instance if $l$ is polynomial in $k$),
we can set $\varepsilon = (q-1)kl/q^{\kappa k} \to 0$.

Still, we can derive an unconditional result, valid for any $(k, l)$. Recall the
maximum distance $d_{\max}$ of a linear code is the maximum weight of a codeword.

Theorem 18. For any $(k, l)$, and $k + l \le n \le kl$, we have

P[d„,ax(C' * GY >k + l]< ■ 

Proof. Union bound for $P[\exists \text{ lin. rel. of weight} \ge k + l]$, which means keep only
terms $w \ge k + l$ in Proposition 10 and use only part (ii) of Theorem 14. □

So, with high probability $(C * C')^{\perp}$ has $d_{\max} < k + l$. This is a strong
restriction (for instance it also implies $\dim (C * C')^{\perp} < k + l$).

5 Squares and higher powers

Let $C$ have generating matrix $G \in \mathbb{F}_q^{k \times n}$ with columns $p_1, \dots, p_n \in \mathbb{F}_q^k$ and
rows $x_1, \dots, x_k \in \mathbb{F}_q^n$. As above, the $s$-th power $C^{\langle s \rangle}$ and its generating matrix
$\widehat{G}$ admit the following equivalent descriptions:

(i) $\widehat{G}$ has rows all $*$-monomials of degree $s$ in the $x_i$

(ii) $C^{\langle s \rangle}$ is the projection of $C^{\otimes s}$ on the diagonal

(iii) $\widehat{G}$ has columns the elementary tensors $p_1^{\otimes s}, \dots, p_n^{\otimes s}$

(iv) $C^{\langle s \rangle}$ is the image of the evaluation map

$$\mathrm{ev}: \mathbb{F}_q[t_1, \dots, t_k]_s \to \mathbb{F}_q^n, \quad P \mapsto (P(p_1), \dots, P(p_n)).$$

(Where $R_s$ denotes the $s$-th homogeneous component of $R$.)

We deduce at once $\dim C^{\langle s \rangle} \le \min(n, \binom{k+s-1}{s})$. For $s = 2$ it is shown in
[3] that for random such $C$, with high probability there is equality: $\dim C^{\langle 2 \rangle} =
\min(n, \binom{k+1}{2})$ (which could in turn be translated into a general position result
for rank 1 symmetric matrices). It is interesting to note that not having to face
unbalanced $(k, l)$ made it easier for these authors to deal with short relations,
hence to control $d_{\min}((C^{\langle 2 \rangle})^{\perp})$ in [3, Prop. 2.4]. By contrast, in our setting,
independence of $C$ and $C'$ made it easier to deal with long relations, hence to
control $d_{\max}((C * C')^{\perp})$ in Theorem 18.

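Concerning higher powers, one should be careful of the:

Proposition 19. For $s > q$ we always have strict inequality

$$\dim C^{\langle s \rangle} < \binom{k+s-1}{s}.$$

More precisely, we have

$$\dim C^{\langle s \rangle} \le \min(n, \chi_q(k, s))$$

where […, App. A]:

$$\chi_q(k, s) = \dim S^s_{\mathrm{Frob}} \mathbb{F}_q^k = \dim\left(\mathbb{F}_q[t_1, \dots, t_k]/(t_i^q t_j - t_i t_j^q)\right)_s.$$

Proof. The map $*$ is Frobenius-symmetric, so in (ii) the projection $C^{\otimes s} \to C^{\langle s \rangle}$
factors through $S^s_{\mathrm{Frob}} C$. Alternatively, in (iv), $\ker(\mathrm{ev})$ contains all multiples of
the $t_i^q t_j - t_i t_j^q$. □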
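
The effect of the Frobenius can be seen on the smallest case $q = 2$, where $x * x = x$ for every word $x$. The sketch below computes the dimension of an $s$-th power from a generating matrix over a prime field; the function names and the example (columns running twice through the nonzero vectors of $\mathbb{F}_2^3$, so that $n = 14$) are illustrative assumptions.

```python
from itertools import combinations_with_replacement, product
from math import comb


def rank_mod_p(rows, p):
    """Rank over the prime field F_p of the matrix whose rows are `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def power_dimension(G, s, p):
    """Dimension of the s-th *-power of the row span of G, over F_p."""
    n = len(G[0])
    monomials = []
    # One generator per *-monomial of degree s in the rows of G.
    for idx in combinations_with_replacement(range(len(G)), s):
        row = [1] * n
        for i in idx:
            row = [(a * b) % p for a, b in zip(row, G[i])]
        monomials.append(row)
    return rank_mod_p(monomials, p)


# Columns: every nonzero vector of F_2^3, each taken twice (n = 14, k = 3).
cols = [c for c in product(range(2), repeat=3) if any(c)] * 2
G = [[c[i] for c in cols] for i in range(3)]

# s = 3 > q = 2: the dimension stays below the monomial count comb(5, 3).
print(power_dimension(G, 3, 2), comb(3 + 3 - 1, 3))
# s = 2 = q: the monomial count comb(4, 2) is still attained.
print(power_dimension(G, 2, 2), comb(3 + 2 - 1, 2))
```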

6 Open problems 

In our probabilistic model we considered random matrices of the form $u_i =
p_i q_i^T$ for column vectors $p_i \in \mathbb{F}_q^k$, $q_i \in \mathbb{F}_q^l$ possibly zero. However, as already
noted, it is perhaps more natural to restrict these $p_i$, $q_i$ to stay nonzero, so the
$u_i$ become uniformly distributed rank 1 matrices. Considering the $p_i$ (resp. $q_i$)
as the columns of a generating matrix of a code $C$ (resp. $C'$), this translates into
considering only codes with full support, although of dimension possibly less
than $k$ (resp. $l$). Then, a further model would be to request that these generating
matrices have full rank. That means: take $C$ (resp. $C'$) uniformly distributed
in the set of $[n, k]$ (resp. $[n, l]$) codes with full support. Clearly this could only
help get sharper bounds. In particular:

Problem 20. Do these alternative models allow us to relax our condition $\mathcal{P}(\varepsilon, \kappa)$?
Do they give bounds valid without any restriction on $(k, l)$?

Proposition 11 suggests that the fate of long relations should essentially not
depend on the probabilistic model. On the other hand, for short relations,
it certainly does. In fact, relations of weight less than $k + l$ are perhaps less
tractable because, for such a length, $C$ and $C'$ necessarily intersect. This leads to
the following, which would encompass both our results (remove the conditioning)
and those of [1] (set $i = k = l$):

Problem 21. For any $n, k, l, i, j$, estimate the conditional probability

$$P[\dim C * C' = j \mid \dim C \cap C' = i].$$

We saw the existence of relations of length $w$ is related to the distribution of
$s_w = u_1 + \cdots + u_w$. When the $u_i$ are uniformly distributed matrices of rank 1,
this reduces to:

Problem 22. In $\mathbb{F}_q^{k \times l}$, what is the number
of decompositions of a matrix of rank $r$ as an ordered sum of $w$ matrices of
rank 1?

It is easily seen that this number is well defined, which means, it is the same
for all such matrices of rank $r$. Of special importance are the values for $r = 0$, which
control $P[s_w = 0]$. We leave it as an exercise to link their computation with
that of the weight distribution of the code $(S_k \otimes S_l)^{\perp}$, where $S_k$ is the $[\frac{q^k-1}{q-1}, k]$
$q$-ary simplex code.

Considering powers of a code leads similarly to counting families of elementary
$s$-th power tensors summing to zero.

Problem 23. For fixed $s$, and a random $[n, k]$ code $C$, estimate the probability
$P[\dim C^{\langle s \rangle} = \min(n, \chi_q(k, s))]$.

And then, what if we also let $s$ vary?

It is interesting to note that, up to code equivalence, any $[n, k]$ code $C$ with
full support can be obtained from the simplex code $S_k$ by deleting and repeating
columns. Then $C^{\langle s \rangle}$ is obtained from $S_k^{\langle s \rangle}$ by deleting and repeating the same
columns. Some authors also call $S_k^{\langle s \rangle}$ the $s$-th order projective Reed-Muller code
(in $k$ variables); it has dimension $\chi_q(k, s)$. As above, we can split our Problem
in two cases: for $n \ge \chi_q(k, s)$, we’re interested in relations between rows of the
generating matrix of $C^{\langle s \rangle}$, which is linked to the weight distribution of $S_k^{\langle s \rangle}$, while
for $n \le \chi_q(k, s)$, we’re interested in relations between columns, which is linked
to its dual weight distribution.

Last, it is the author’s opinion that considering only the dimension of products
is not entirely in the spirit of coding theory. In fact, it is a purely algebraic
problem, where $(\mathbb{F}_q^n, *)$ could be replaced by any space equipped with
a bilinear inner composition law. See [1] for an example where the space
is an extension field with its natural multiplication. However, what is genuinely
coding-theoretic is to consider minimum distance besides dimension. It is
well known that, asymptotically, a random code lies on the Gilbert-Varshamov
bound $R = 1 - H(\delta)$. It is then very natural to ask:

Problem 24. Does the product of two random codes, or the square or higher
powers of a random code, lie on the GV bound?

Observe that the answer would be negative if the question were stated with
tensor product instead of $*$-product.